Survey on Scalable Failure Detectors

Author

  • Shihui Song
Abstract

Maintaining a timely view of current system status is essential to the performance and functionality of distributed systems. Failure detectors have therefore long been a core component of such systems. In this paper, we evaluate two failure detection algorithms aimed specifically at large-scale systems. Both assume a fail-stop (non-Byzantine) model, but the similarities end there. Dynamo’s failure detector relies on pinging, with a weak eventual completeness model based on randomization. The classic gossip protocol, by contrast, is based on heartbeats, with a strong completeness model. Our simulations test the conclusions of Dynamo’s failure detector in order to evaluate its gains over traditional gossip-style heartbeat approaches. We end with remarks on the advantages of both algorithms and the systems best suited to each.


Similar resources

Failure Detectors for Large-Scale Distributed Systems

This paper discusses the problem of implementing a scalable failure detection service for Grid systems. More specifically, traditional implementations of failure detectors are often tuned for running over local networks and fail to address some important problems found in wide-area distributed systems, such as Grid systems. We identify some of the most important problems raised in the context o...

An MPI failure detector over PMPI

Fault detectors are valuable services that provide information about process failures in large-scale parallel systems. Many previous studies suggest guidelines for the implementation of a fault detector. However, a practical implementation remains a challenge because parallel system environments vary widely in both hardware and software. This study explains that Fault detector is able t...

Implementation and performance evaluation of an adaptable failure detector

Chandra and Toueg introduced the concept of unreliable failure detectors. They showed how, by adding these detectors to an asynchronous system, it is possible to solve the Consensus problem. In this paper, we propose a new implementation of an Eventually Perfect failure detector (◇P). This implementation is a variant of the heartbeat failure detector which is adaptable and can support scalable app...

On-chip detection of non-classical light by scalable integration of single-photon detectors

Photonic-integrated circuits have emerged as a scalable platform for complex quantum systems. A central goal is to integrate single-photon detectors to reduce optical losses, latency and wiring complexity associated with off-chip detectors. Superconducting nanowire single-photon detectors (SNSPDs) are particularly attractive because of high detection efficiency, sub-50-ps jitter and nanosecond-...

Authorization models for secure information sharing: a survey and research agenda

This article presents a survey of authorization models and considers their 'fitness-for-purpose' in facilitating information sharing. Network-supported information sharing is an important technical capability that underpins collaboration in support of dynamic and unpredictable activities such as emergency response, national security, infrastructure protection, supply chain integration and emerg...


Publication date: 2014